Noisy Ostracods: A Fine-Grained, Imbalanced Real-World Dataset for Benchmarking Robust Machine Learning and Label Correction Methods

Neural Information Processing Systems

We present the Noisy Ostracods, a noisy dataset for genus and species classification of crustacean ostracods with specialists' annotations. Of the 71,466 specimens collected, 5.58% are estimated to be noisy (possibly problematic) at the genus level. The dataset was created to address a real-world challenge: creating a clean fine-grained taxonomy dataset. The Noisy Ostracods dataset has diverse noise from multiple sources. First, the noise is open-set, including new classes discovered during curation that were not part of the original annotation.